System and method for an open autonomy kernel (OAK)

ABSTRACT

The Open Autonomy Kernel (OAK) addresses critical infrastructure requirements for next generation autonomous and semi-autonomous systems ( 24 ), including performance tracking, anomaly detection, diagnosis, fault recovery, and plant “safing”. OAK combines technologies in automated planning and scheduling, control agent-based systems ( 22 ), and model based reasoning to form a portable software architecture ( 26 ), knowledge-base, and open Application Programming Interface (API) to enable integrated auxiliary subsystem autonomy.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. provisional application No. 60/295,201, filed on Jun. 1, 2001, which is hereby incorporated by reference in its entirety.

STATEMENT OF GOVERNMENTAL INTEREST

This invention was made with U.S. Government support under the Office of Naval Research, Arlington, Va. under contract number N00014-00-C-0050. The U.S. Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Efforts to automate the control of complex connection-based systems such as, for instance, engineering plants aboard naval vessels have emphasized the infrastructure and diagnostic aspects of plant management, i.e., monitoring subsystems via sensors and presentation of the sensor data to human operators. Interpretation of and response to the data remain largely manual tasks. This interpretation and response function, especially in damage control scenarios, is a significant factor in determining manpower levels. If the incident assessment and response loop can be closed with a reliable autonomous reasoning process, significant relief in overall manpower levels can be realized. The best automation efforts to date have been based on expert diagnostic knowledge in the form of coded rules or procedures that are interpreted by the system at runtime to detect, predict, or diagnose fault conditions. However, even the best automation efforts require significant amounts of human diagnosis and input.

SUMMARY

The open autonomy kernel (OAK) architecture of the present invention extends an automated reasoning paradigm in a number of important ways. First, OAK is based upon the belief that the next evolutional step in complex system automation involves goal-directed commanding at the system component level. This shifts the control paradigm from one of sending commands to subsystems, to sending goals and resources to intelligent subsystem management control agents that require minimal, if any, direct operator interaction. Secondly, subsystem management control agents are loosely coupled across a distributed, networked infrastructure. This results in a dynamic and adaptable coordination capability, leading to a more survivable overall system architecture. Finally, OAK uses declarative model-based reasoning as an extension to current rule and procedure based formalisms in the control agent control loop. Qualitative model-based reasoning provides the unique potential for performing real-time detection, identification, and diagnosis of unanticipated fault conditions.

OAK is based upon an open control agent communication infrastructure that includes a plurality of control agents coupled with respective subsystems. Control agents communicate via a messaging protocol that permits them to share their subsystem status and subsystem goals with other control agents. Status and goal data allows control agents to collaborate in diagnosis, prediction, or event response scenarios. The OAK infrastructure further supports the notion of a command element that can control and establish goals for the control agents. The command element may be a human operator, an outside control agent, an external software system, or the control agent community.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates one example of a logical OAK hierarchy.

FIG. 2 illustrates a sample OAK communication network of control agents.

FIG. 3 illustrates the architecture of a control agent within OAK.

DETAILED DESCRIPTION

The open autonomy kernel (OAK) provides a goal-directed coordinated intelligence approach with respect to complex connection-based system architectures. Connection-based systems can be thought of as physical or logical systems that are comprised of well defined components whose behavior is influenced by other well defined components along one dimensional paths. Examples of connection-based systems include, but are not limited to, electrical systems, plumbing systems, and computer networks. In addition, connection-based systems can be combinations of functionally distinct but interdependent system components such as an electrical system, a plumbing system, and a computer network that are components of, for instance, an overall engineering plant management system.

The term “components” includes both physical hardware components such as, but not limited to, a valve, a circuit, or a switch, as well as a cohesive collection of such hardware components, such as, but not limited to, a pumping station, an electrical substation, or a power grid. Components may also refer to a hardware/software combination that behaves as a distinct unit (e.g., a valve and its controller). Components may also comprise other more detailed components. A key distinction is that the behavior of an individual component is easily defined from the perspective of peer components interacting with the individual component.

OAK provides intelligent support for three key features pertaining to overall system control: estimation, communication, and control. The primary building blocks of an OAK system architecture include control agents (CA), model-based reasoning engines (MBRE), control agent communications brokers (ACB), and control mediators (CM).

OAK utilizes two key technologies to perform the aforementioned functions. First, control is distributed through the use of control agent-based software. Secondly, OAK implements intelligent reasoning within each control agent using a technology known as Model-Based Reasoning (MBR). A primer of control agent based software and model-based reasoning as related to the present invention is included to facilitate a better understanding of the present invention.

Control Agent Based Software

A control agent is an independently executing controller that is coupled to and responsible for an identifiable subsystem within a larger system, that is, at least in part, controlled by other peer control agents. A control agent is capable of perceiving changes in the subsystem, identifying the state of the subsystem, planning actions in accordance with the current subsystem state and/or desired subsystem goals, and executing planned actions.

OAK control agents are intrinsically permanent and stationary, exhibit both reactive and deliberative behavior, and are declaratively constructed. Control agents are reactive in their ability to reconfigure the subsystems within their control in the context of an existing task plan. Control agents are deliberative in their ability to create a task plan in response to observed states and defined goals. OAK control agents' extrinsic characteristics include proximity to the controlled subsystem, social independence, and both awareness of and cooperativeness with goals and states of other control agents. OAK control agents are comprised of nearly homogeneous control agents, and are independently executed yet contain unique models of the subsystem for which the control agent is responsible. OAK control agents are environmentally aware and behavior of the environment is predictable through each control agent's model.

Model-based Reasoning

OAK control agents employ Model-based Reasoning (MBR) to deduce the condition of their underlying subsystems. Model-based reasoning, as applied to OAK, can be thought of as:

-   -   reasoning about a subsystem's behavior from an explicit model of the mechanisms underlying that behavior. Model-based techniques can very succinctly represent knowledge more completely and at a greater level of detail than techniques that encode experience, because they employ models that are compact axiomatic systems from which large amounts of information can be deduced.

In OAK, a particular flavor of MBR called “Qualitative Model-Based Reasoning” is used. This branch of MBR focuses upon using symbolic (rather than quantitative) representations of system behavior in order to simplify the modeling process and enable reactive-level response times. First, model-based programming techniques are used to develop a set of qualitative, first-principles component models for elements of the controlled system. Multiple instances of these component models are combined to produce larger aggregate models of an entire system. Then, these aggregate models are used at runtime by an MBR kernel to produce system diagnoses. This kernel is able to resolve complex system-level interactions between components to produce a best estimate of system state. By placing the responsibility of resolving these complex interactions upon the MBR kernel, the human modeler is freed from having to explicitly reason through and encode all possible diagnoses a priori.

First Principles Modeling

First principles models are interconnected finite state automata (FSA) in which interactions among connected FSA are explicitly enumerated by associating attributes used to define states. Model-based reasoning (MBR) is state estimation or planning of first principles models based upon resolving conflicts between state definitions and the values of attributes used to define states.

Discrete event system theory shows that systems may be modeled using an automaton G=(X, E, ƒ, Γ, x_(o), X_(m)), in which X is the set of discrete states, or state space; E is the finite set of events; ƒ is the transition function; Γ is the active event function mapping X to E; x_(o) is the initial state and X_(m) is the set of marked states. Control of G is provided by a control policy that includes a set of control actions.

Automata that are memoryless are considered Markov Processes. Association of control actions, costs and probabilities with state transitions allows us to derive Markov Decision Processes (MDP). MDP are defined by the tuple (X_(S), C, E_(C), ƒ_(T), R) in which: X_(S) is the finite set of states of the system being tracked; C is the set of possible commands of which c is an element (c∈C); E_(C) is a finite set of commanded events described in the form (x_(I), c, x_(F)) where x_(I) is the initial state and x_(F) is the final state; ƒ_(T) is a state transition model of the environment which is a function mapping X_(S)×E_(C) into discrete probability distributions over X_(S), and R is the cost function over E_(C). The actions are non-deterministic, so we write ƒ_(T)(x, e), in which (x∈X_(S)) and (e∈E_(C)), for the probability that transition e will occur given the current state x.

OAK is required to reason with the possibility of component failures, so the commanded event set E_(C) is replaced by the set of possible events E_(P) that includes the commanded set and the failure events E_(F) (E_(P)=E_(C)∪E_(F)). Note that failure events may be commanded or spontaneous, and are assumed permanent and non-recoverable. The state transition function ƒ_(T) is extended to X_(S)×E_(P), providing us with a failure sensitive MDP (X_(S), C, E_(P), ƒ_(T), R).

By way of example, a failure sensitive model of a simple circuit breaker may be described as follows:

-   -   M(breaker):
        -   X_(S): {open, shut, stuck-open, stuck-shut}
        -   C: {open, shut, none}
        -   E_(P): {(open, shut, shut), (shut, open, open), (open, shut, stuck-open), (open, none, stuck-shut), (shut, open, stuck-shut), (shut, none, stuck-open)}
        -   ƒ_(T): ρ(open, shut, shut), ρ(shut, open, open) ≅ 1;
            ρ(open, shut, stuck-open), ρ(shut, open, stuck-shut) ≅ 0.01;
            ρ(*, *, stuck-open), ρ(*, *, stuck-shut) ≅ 0.0001
        -   R: {r_(open), r_(shut)=1, r_(none)=0}
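As a minimal sketch, the breaker model above can be written down as plain Python data with a lookup for the approximate transition probabilities. The names BREAKER, FAILURE_STATES, and transition_probability are hypothetical and do not appear in the specification. Reading r_(open) and r_(shut) as both equal to 1, and applying the wildcard probability to the uncommanded failure events, are assumptions.

```python
# Illustrative encoding of M(breaker); all identifiers are hypothetical.
BREAKER = {
    "states": {"open", "shut", "stuck-open", "stuck-shut"},
    "commands": {"open", "shut", "none"},
    # Possible events E_P, each written as (initial state, command, final state).
    "events": {
        ("open", "shut", "shut"),
        ("shut", "open", "open"),
        ("open", "shut", "stuck-open"),
        ("open", "none", "stuck-shut"),
        ("shut", "open", "stuck-shut"),
        ("shut", "none", "stuck-open"),
    },
    # Cost function R (assumed reading: r_open = r_shut = 1, r_none = 0).
    "cost": {"open": 1, "shut": 1, "none": 0},
}

FAILURE_STATES = {"stuck-open", "stuck-shut"}


def transition_probability(event):
    """Return the approximate probability listed for an event in E_P."""
    _initial, command, final = event
    if event not in BREAKER["events"]:
        return 0.0      # not a modeled event
    if final not in FAILURE_STATES:
        return 1.0      # nominal commanded transition, approximately 1
    if command != "none":
        return 0.01     # failure while responding to a command
    return 0.0001       # spontaneous failure (wildcard entries)
```

For instance, transition_probability(("shut", "open", "stuck-shut")) yields 0.01, the value listed for a breaker that sticks while being commanded open.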

The behavior a system exhibits in state x, where x∈X_(S), is defined by attributes. Attributes are physical properties of the system. Propositional statements containing these attributes are used to express system behavior in each state. The behavior of our circuit breaker may be defined through the following propositions:

-   -   (breaker.state=(open ∨ stuck-open))→conduct
    -   (breaker.state=(shut ∨ stuck-shut))→¬conduct

Because internal sensors are subject to faults, and the measure of a system is its effect on the systems it services, OAK is interested in the behavior of our system as it relates to the outside world. External influences are modeled as a special class of attributes known as interfaces. The circuit breaker has two such attributes: current_(in) and current_(out). Using these interfaces, behavior definition propositions are rewritten as:

-   -   (breaker.state=(open ∨ stuck-open))→(current_(in)=current_(out))
    -   (breaker.state=(shut ∨ stuck-shut))→( current_(in) current_(out))

A system containing multiple related components may be modeled by associating component model interfaces in propositional statements declared in the system model. A system model is represented by the tuple: M_(system)=(M_(comp), A, P) in which M_(comp) is a set of failure sensitive component models, A is a set of attributes defined by the system, and P is a set of propositions associating system defined attributes A and M_(comp) interfaces. A small system comprised of two circuit breakers connected in series may be represented as follows:

-   -   M(system):
        -   M(breaker₁), M(breaker₂): M(breaker); //defines breaker₁ and breaker₂ to be “breakers”
        -   breaker₁.current_(out)=breaker₂.current_(in)

Partially Observable Markov Decision Processes (POMDP) are MDP that have been extended to include a finite set of observations. Failure sensitive POMDP are represented by the tuple M=(X_(S), C, E_(P), O, ƒ_(T), R), in which O is the observation function that maps X_(S) to the finite set of observations. However, states have been defined as abstractions represented in terms of attributes. As abstractions, states are not directly observable. The ability to observe a state is associated with the ability to observe the attributes that define the state. A state is observable if all of the attributes in any clause of a disjunctively formed state behavior definition are observable. A state may be considered partially observable if it is not observable and at least one attribute is observable. The observation function O in traditional POMDP is replaced by an observation function O_(A) that maps the attribute set A to the finite set of observations. The probability of making an observation o from the attribute a is denoted as O(o,a).
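The observability rule in the preceding paragraph can be sketched as a small classifier. This is an illustrative reading only: classify_state is a hypothetical name, each clause of the disjunctive state definition is assumed to be given as a non-empty set of attribute names, and the label "unobservable" for the remaining case is an assumption.

```python
def classify_state(clauses, observable_attributes):
    """Classify a state from the attributes used in its behavior definition.

    clauses: list of non-empty sets of attribute names, one set per clause
             of the disjunctively formed state definition (assumed shape).
    observable_attributes: set of attribute names that can be observed.
    """
    # Observable: every attribute of at least one clause can be observed.
    if any(clause <= observable_attributes for clause in clauses):
        return "observable"
    # Partially observable: not observable, but some attribute can be observed.
    if any(clause & observable_attributes for clause in clauses):
        return "partially observable"
    return "unobservable"
```

With a single clause over current_in and current_out, the state is observable when both attributes are sensed and partially observable when only one of them is.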

Model Based Estimation

Estimation methods have been demonstrated for systems of components modeled as failure sensitive POMDP. The first step in estimation is to determine a belief state for the system. Assuming an initial state X_(I) of the components, and a set of commands c⊂C applied to the system components, model-based estimation makes the naïve assumption that the most probable component transitions in E_(C) have occurred. The belief state X_(B) for all components is easily derived. A single propositional statement that describes the believed values of attributes within the system may be generated by conjoining the propositional clauses associated with these belief states, general propositional clauses showing component relationships and known attribute values for interfaces external to the system. For example, consider the simple two circuit breakers in series model:

-   -   M(system): {
        -   M(breaker₁), M(breaker₂): M(breaker);
        -   breaker₁.current_(out)=breaker₂.current_(in);
        -   breaker₁.current_(in)=external stimulus;}
    -   breaker₁.initial_state=open;
    -   breaker₂.initial_state=shut;

Given the command breaker₂.command=open and an externally provided flow (external stimulus=true), the state set {breaker₁.state=open; breaker₂.state=open} is realized and the following propositional statement presents itself:

    -   (external stimulus=true) ∧ (breaker₁.current_(in)=external stimulus) ∧ (breaker₁.current_(out)=breaker₁.current_(in)) ∧ (breaker₂.current_(out)=breaker₂.current_(in)) ∧ (breaker₁.current_(out)=breaker₂.current_(in)).

One can see how a value for an attribute, in this case our external stimulus, propagates through the system, inferring values on the current_(in) and current_(out) attributes in breakers one and two.
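The propagation just described can be sketched for breakers connected in series. This is an illustrative sketch, not the MBR kernel: the names CONDUCTING and propagate are hypothetical, "open" is treated as the conducting state as in the example above, and the assumption that a shut or stuck-shut breaker passes no current downstream is ours.

```python
# States in which a breaker passes current in this document's convention.
CONDUCTING = {"open", "stuck-open"}


def propagate(external_stimulus, belief_states):
    """Propagate a boolean stimulus through breakers connected in series.

    belief_states: believed state of each breaker, upstream first.
    Returns one (current_in, current_out) pair per breaker.
    """
    currents = []
    current = external_stimulus
    for state in belief_states:
        current_in = current
        # Assumption: a non-conducting breaker yields current_out = False.
        current_out = current_in if state in CONDUCTING else False
        currents.append((current_in, current_out))
        current = current_out  # breaker_i.current_out = breaker_(i+1).current_in
    return currents
```

With both breakers believed open and a true stimulus, every interface attribute is inferred to be true, which matches the conjoined statement in the example.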

Observed attribute values generate additional observed propositional clauses (e.g., breaker₂.current_(out)=true). If the belief state is incorrect then a conflict between the observed clause and one or more of the belief state clauses will be generated. If no conflict is generated then the belief state is confirmed. A solution to a conflict, or set of conflicts, is defined as a set of alternate transitions E that generate propositional statements that do not conflict with observation clauses. For example, in the breaker example if a single transition e: (shut, open, open) is applied to the above initial state and breaker₂.current_(out)=false is subsequently observed, a conflict will be generated. If the transition set is replaced by either of the sets

-   -   E={breaker₂.transition(shut, open, stuck-shut)} or
    -   E={breaker₂.transition(shut, open, open), breaker₁.transition(open, none, stuck-shut)}

these transitions will generate a belief state that does not conflict with the observation clause. Using the transitional probabilities found in the transition function ƒ_(T) we can apply Bayes' rule to the probabilities of our two candidate solutions to determine their relative probabilities. The generation of possible solution sets can be accomplished using Conflict-directed Best First Search (CBFS).
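The relative probabilities of the two candidate solutions can be worked through numerically. The sketch below is not CBFS; it only normalizes the candidate priors, under the assumption that both candidates explain the observation equally well, so Bayes' rule reduces to normalizing the products of transition probabilities. The function name rank_candidates and the candidate labels are hypothetical; the probabilities are the approximate values from the breaker model.

```python
from math import prod


def rank_candidates(candidates):
    """Normalize the joint transition probabilities of candidate solutions.

    candidates: mapping from a label to the list of transition
    probabilities that make up that candidate transition set.
    """
    weights = {name: prod(probs) for name, probs in candidates.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


posterior = rank_candidates({
    # breaker2 (shut, open, stuck-shut): commanded failure, about 0.01
    "breaker2 stuck-shut": [0.01],
    # breaker2 (shut, open, open) about 1, breaker1 (open, none, stuck-shut) about 0.0001
    "breaker1 stuck-shut": [1.0, 0.0001],
})
```

Under these assumptions the first candidate carries roughly 99% of the probability mass, so it would be selected as the most likely diagnosis.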

When multiple candidate solutions satisfy a set of conflicts, the most likely solution is selected as a basis for reconfiguration and subsequent estimation. To protect the estimation engine from permanently adopting an incorrect solution, a Truth Maintenance System (TMS) is used to track likely solutions. If future observations generate clauses that provide support to alternative solutions, the TMS generates a historical revision of the belief states.

Encapsulation and Abstraction

Large systems require models that, due to their complexity, are difficult to process using CBFS and TMS. This complexity is mitigated through encapsulation, decomposition and distribution. A system may be decomposed by encapsulating portions of the system model into subsystems of modest size (<25 components). Subsystems inherit component and system semantics. The components of the system, the subsystem attributes, and the propositional statements used by the subsystem to associate components are represented by (M_(comp), A, P). Subsystem automata (X_(S), C, E_(P), O_(A), ƒ_(T), R) are associated with subordinate component automata M_(comp) by defining the subsystem states X_(S) using propositional statements over subordinate component attributes. The entire subsystem is represented by the rather large tuple M_(sub)=(X_(S), C, E_(P), O_(A), ƒ_(T), R, M_(comp), A, P, A_(I)) in which A_(I) is the interface set for the subsystem. This structure provides for subsystems to exhibit automaton-like behavior at an abstract level; however, it is not necessary that they do so. Subsystems that contain null sets for any or all of the abstract automata elements (X_(S), C, E_(P), O_(A), ƒ_(T), R) are acceptable and often desirable.

If a subsystem is believed to be in a state, then the propositional statement associated with its state is applicable. Likewise if a subsystem is believed to be in a state, then the propositional statements that define the subordinate component states are also applicable. The set of applicable statements may be conjoined and reduced to create a single belief statement for the subsystem. Using the combined belief statement, CBFS and TMS may be applied to perform estimation and reconfiguration within the scope of the subsystem.

Because subsystems inherit the properties of component models, a subsystem model may be used as a component of another subsystem. This parent-child relationship permits building topologies of subsystems featuring numerous models at varying levels of abstraction. Parent-child relationships are expressed by associating the behavior, in terms of the child's interfaces and goals, within propositions internal to the parent.

Control with an independent CBFS and TMS for each subsystem is complete if and only if the subsystem behavior represented by the subsystem's proposition statement set is independent. Unfortunately this is not usually the case. A subsystem's interface attributes are, by definition, constrained to the world outside of the subsystem. This problem is limited by enforcing design constraints that remove some of the more difficult cross-subsystem dependencies. Design constraints are focused on eliminating the possibility of conflicts being generated by observations and propositional clauses that are difficult to resolve across subsystems, and also eliminating the possibility of goals being directed to a subsystem that cannot be directly addressed within the subsystem. While these constraints are limiting, experience has shown that robust, useful models are still a possibility. The design constraint is described in terms of conflicts generated during the estimation process. Because our reconfiguration strategy is a mirror of our estimation strategy, equivalent design criteria may be used to simplify reconfiguration. Conflicts may be divided into two distinct types: independent conflicts are identifiable by observation-proposition combinations that are contained within a single subsystem; dependent conflicts are generated by behavior that is not identifiable by observation-proposition combinations within a single subsystem.

Collaboration between independent subsystems is straightforward. Observations are made by independent subsystems that resolve the conflict. After the conflict is resolved the effects of the observation are propagated to the system and other subsystems via interfaces.

As previously stated, OAK is a distributed, multicontrol agent system. An OAK system can have a varying topology based on the overall system application. OAK has two major use-cases that almost fully describe the operation of the system: OAK's reaction to user-input goals; and OAK's reaction to system state change. The primary intelligent components that enable OAK to accomplish these use-cases are a model-based reasoning engine and a planner.

To perform the diagnostic phase of the control cycle, OAK control agents continually update their states using the model-based reasoning engine, and pass these state updates to other control agents that are interested so that these control agents may update their states. In response to these states or to the system's environment, an external actor or an OAK control agent will provide goals to the OAK system, which are distributed for further processing.

FIG. 1 illustrates one example of a logical OAK hierarchy. The circles 10 represent control agents, and the diamonds 12 represent hardware. The connecting lines represent parent-child control agent relationships 14 and connections to hardware 16. User-defined goals flow down the hierarchy while component states generally flow upwards in the hierarchy.

While a hierarchical topology is the topology that is presented herein, the OAK architecture does not preclude other topologies. Other topological possibilities include: peer-to-peer, where each control agent communicates facts and goals to any other; multi-hierarchy, where there exist multiple loosely coupled hierarchies; and a star topology, or one-level hierarchy. For the rest of this discussion a hierarchical topology is assumed.

Each control agent has an associated Control Agent Communication Broker (ACB), which is responsible for handling all of the control agent's communication with a Control Agent Communication Framework (ACF). The ACB maintains a queue of messages coming into the control agent. Additionally, any control agent that has direct communication with hardware has a control mediator (CM) to handle the hardware level goals that are generated by these control agents for the associated hardware, and to receive updates about this hardware. These messages are not handled by the ACF. By using this layered approach, control agents are decoupled from the ACF.

The Control Agent Communication Language (ACL) of OAK provides several message templates, including messages for queries, state updates, subscription requests, goals, exceptions, and control agent coordination. To protect information, each control agent's ACB may optionally have an information access matrix that determines which control agents are allowed to subscribe to that control agent's events. The ACF allows any control agent to communicate directly with any other control agent. Thus, communication between control agents is not restricted to any logical hierarchical framework. The ACF also supports message logging.
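A broker with an incoming message queue and an optional access check on subscriptions might be sketched as follows. The class CommunicationBroker, its method names, and the agent names in the tests are hypothetical, and the information access matrix is reduced here to a set of permitted subscriber names for one control agent. The plain first-in, first-out queue is also an assumption.

```python
from collections import deque


class CommunicationBroker:
    """Illustrative stand-in for one control agent's ACB."""

    def __init__(self, owner, allowed_subscribers=None):
        self.owner = owner
        # None means no access restriction is configured (it is optional).
        self.allowed = allowed_subscribers
        self.subscribers = []
        self.inbox = deque()  # queue of messages coming into the agent

    def subscribe(self, agent_name):
        """Register a subscriber if the access check permits it."""
        if self.allowed is not None and agent_name not in self.allowed:
            return False
        if agent_name not in self.subscribers:
            self.subscribers.append(agent_name)
        return True

    def deliver(self, message):
        """Queue an incoming message for the owning control agent."""
        self.inbox.append(message)

    def next_message(self):
        """Hand the oldest queued message to the control agent, if any."""
        return self.inbox.popleft() if self.inbox else None
```

Keeping the access check inside the broker means the control agent itself never has to know who is allowed to listen, which matches the layered decoupling described above.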

One of the major use-cases of OAK is to react to goals entered by an external actor. These are system-level commands which have the potential of transitioning the entire multicontrol agent system from one state to another.

OAK communicates with external actors via a graphical interface, which is also decoupled from the rest of the system. Goals that are entered from an external actor, such as a human operator, through this interface are sent directly to a root level control agent using a goal message. This control agent develops a plan with goals that apply to the domains of its child control agents. Goals have a priority associated with them, which is used for goal preemption.

After the root level control agent develops a plan and directs a goal to one of its child control agents, the goal is received by the child control agent's ACB, sorted into its queue, and eventually accepted by the control agent for processing. This control agent develops a plan to implement the goal. Since this control agent is a root of its own tree, the goals developed by its planner are passed to its children control agents. This propagation continues until leaf control agents at the hardware level receive goals for their specific domains.
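Sorting incoming goals by priority can be sketched with a heap. The class GoalQueue and the goal strings in the tests are hypothetical, and the convention that a lower number means a more urgent goal, with ties served in arrival order, is an assumption rather than something the description specifies.

```python
import heapq
import itertools


class GoalQueue:
    """Priority-ordered queue of incoming goals (illustrative only)."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: arrival order

    def put(self, goal, priority):
        # Assumption: smaller priority values are served first.
        heapq.heappush(self._heap, (priority, next(self._arrival), goal))

    def get(self):
        """Return the most urgent goal, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A goal that arrives later but carries a more urgent priority is returned ahead of earlier, lower-priority goals, which is one simple way to realize preemption at the queue level.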

Once goals are received at the leaf level, a similar process occurs in that a plan is developed and goals are passed out of the control agent. The only difference is that the goals are now passed to the control agent's control mediator (CM), which translates the goal into commands that a hardware driver can understand. The CM has a queue of such commands in case the control agent is able to generate commands more quickly than the CM is able to deliver them. Since the CM is the only component that has direct interaction with the hardware drivers, it is the only component that has to be updated when hardware itself is changed or when hardware drivers are updated.

Successful goal implementation implies a state change, so a control agent does not have to set up callbacks with the hardware to confirm that a command was successful. Leaf control agents are already required to monitor the hardware they control for changes in order to accomplish the second major use-case of OAK. Therefore, the leaf control agent waits for a reaction from the hardware monitors to indicate that the command has been successful. The control agent is then free to pass out goals that were temporally dependent on the goal just implemented. Since state changes are propagated up the hierarchy (as well as to unrelated control agents, potentially), control agents at higher levels are also informed that their goals were implemented and they can then pass out goals that had to be put in a wait state. To an implementer of OAK, this means that the incoming goal use-case and the state change use-case, which comprise the two major functions of OAK, are decoupled.

To appropriately handle changes in the state of the system, OAK uses a model-based reasoning engine (MBRE). Recall that in OAK, any control agent may subscribe to another control agent's events (which are triggered by state changes) assuming that it has permission. The control agent most likely to be interested, however, is the control agent's parent, because the states of its children are reflected as variables in its model. For this reason, parents will subscribe to most of their children's state change events. The external actor's control agent may also subscribe to any control agent's state change events, so that the user can be advised of a state change at any level in the hierarchy.

State change events are transmitted through the use of a fact message. This message contains a representation of the knowledge contained in a control agent. Once a fact message is received, the control agent, using the MBRE, determines whether or not the change is important enough to warrant a state change. If the state changes, all subscribed control agents are informed, and propagation of state changes begins as discussed above. Note that since many control agents may subscribe to an event, state changes may be propagating in several subtrees at any given time.

One of OAK's strengths is realized in a control agent's ability to determine the state of its model, compare that state to a knowledge base, and reactively plan. Thus, a control agent can autonomously control its domain until a control agent that is higher in the hierarchy (or in the case of the root control agent, the external actor's control agent) preempts its control. The component of OAK that controls reactive planning is called the reactive manager.

There are two types of information in the reactive manager: persistent goals and emergency conditions. Persistent goals are simply goals that are desired true for the duration of the control agent. An emergency condition is defined as a state transition that cannot be reversed and requires OAK to act immediately to protect the resident system. When OAK detects an emergency, it will preempt the external actor's goal and go to a predetermined goal that will minimize damage to the system being modeled. This goal will be implemented until the actor enters a goal with ‘preempt’ priority. From this point on, goals from the user are implemented as completely as possible based on the damage to the system.

As mentioned above, when a control agent receives a goal it must use a planning mechanism to create a viable plan to achieve that goal. The plan takes the form of an ordered sequence of fragments. Each fragment is comprised of one or more subgoals. The idea is that within a fragment each subgoal may be accomplished in parallel, while the subgoals in every earlier fragment must be completed before the current fragment may be attempted.
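The fragment ordering rule can be sketched as a simple executor. The function execute_plan and the callback achieve are hypothetical names; subgoals within a fragment are tried one after another here as a stand-in for parallel execution, and stopping at the first fragment that contains a failed subgoal is an assumption about failure handling.

```python
def execute_plan(plan, achieve):
    """Run a plan given as an ordered list of fragments.

    plan: list of fragments, each a list of subgoals.
    achieve: callable taking a subgoal and returning True on success.
    Returns the subgoals that were completed, in order.
    """
    completed = []
    for fragment in plan:
        # Subgoals of one fragment do not depend on each other, so their
        # order inside the fragment is irrelevant (they could run in parallel).
        results = [(subgoal, achieve(subgoal)) for subgoal in fragment]
        completed.extend(subgoal for subgoal, ok in results if ok)
        # A later fragment may start only once this one is fully complete.
        if not all(ok for _, ok in results):
            break
    return completed
```

If any subgoal of the first fragment fails, no subgoal of the second fragment is attempted, which enforces the completion-before-next-fragment constraint.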

Several different planners may be appropriate for different control agents depending on how complex they are, what domain they are planning for, etc. Therefore, the planner is instantiated at run-time differently for each control agent from a group of developed planners. To date, two planners have been developed: the Scripted Planner and the highly specialized Graph-Based Planner.

The scripted planner is extremely simple but useful for simple control agents, such as so-called leaf control agents. The scripted planner originally matched the incoming goal with a pre-defined list of incoming goals, and output a pre-defined plan. A later extension to the scripted planner allowed matching on the incoming goal and a propositional logic expression about the current world-state. For example, Goal A along with (variable1=value1) ∧ (variable2≠value2) would generate plan A. Many different propositional expressions, and therefore plans, could be associated with each incoming goal. Also, since the scripts are checked in a specific predefined order, a simple priority of outputted plans can be imposed. Therefore, with these simple, essentially rule-based scripts, complex behavior could be created.
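The extended scripted planner can be sketched as an ordered list of (goal, condition, plan) entries checked first to last. The names scripted_plan and SCRIPTS, the use of Python callables for the propositional expressions, and the "fallback plan" entry are hypothetical; only the Goal A example comes from the description above.

```python
def scripted_plan(goal, world_state, scripts):
    """Return the plan of the first script matching goal and world state."""
    for script_goal, condition, plan in scripts:
        if script_goal == goal and condition(world_state):
            return plan
    return None  # no script applies


# Scripts are checked in list order, which gives earlier entries priority.
SCRIPTS = [
    (
        "Goal A",
        lambda s: s["variable1"] == "value1" and s["variable2"] != "value2",
        "plan A",
    ),
    # Hypothetical lower-priority entry for the same incoming goal.
    ("Goal A", lambda s: True, "fallback plan"),
]
```

Because the first matching entry wins, reordering the list is enough to change which plan takes priority for a given goal.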

The graph-based planner has been designed specifically for the test domain described below. Test domain planning comprised determining how to move flow from a source to several sinks through a dynamic pipe network, with many operational constraints. This planner represented the target domain as a loadable model. The model was represented internally as a digraph, with weights on each edge set according to the operational constraints. The planner operated by performing Prim's Minimum Spanning Tree algorithm on the graph to determine how to get flow to as many of the desired sinks as possible. Along the way, the software determined the actions the control agent would need to take to align the system in the manner that Prim's algorithm output. Finally, the planner would generate a plan based on the actions determined.
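The approach can be sketched as below. This is an illustrative sketch under stated assumptions, not the planner's actual code: the graph layout and the names `prim_tree` and `plan_flow` are hypothetical. Prim's algorithm is normally defined for undirected graphs, so the sketch simply applies the same greedy, cheapest-edge-first growth from the source over directed edges; how the original planner handled edge direction is not stated.

```python
import heapq
from typing import Dict, List, Set, Tuple

# Hypothetical model format: node -> list of (downstream node, edge weight).
Graph = Dict[str, List[Tuple[str, float]]]


def prim_tree(graph: Graph, source: str) -> Dict[str, str]:
    """Grow a tree from source, always adding the cheapest frontier edge."""
    parent: Dict[str, str] = {}
    visited: Set[str] = {source}
    frontier = [(w, source, v) for v, w in graph.get(source, [])]
    heapq.heapify(frontier)
    while frontier:
        _weight, u, v = heapq.heappop(frontier)
        if v in visited:
            continue
        visited.add(v)
        parent[v] = u
        for nxt, w in graph.get(v, []):
            if nxt not in visited:
                heapq.heappush(frontier, (w, v, nxt))
    return parent


def plan_flow(graph: Graph, source: str, sinks: List[str]) -> List[Tuple[str, str]]:
    """Return the tree edges to align, in source-to-sink order, for reachable sinks."""
    parent = prim_tree(graph, source)
    edges: List[Tuple[str, str]] = []
    for sink in sinks:
        node, path = sink, []
        while node != source and node in parent:
            path.append((parent[node], node))
            node = parent[node]
        if node != source:
            continue  # sink is not reachable from the source; skip it
        for edge in reversed(path):
            if edge not in edges:
                edges.append(edge)
    return edges
```

Each returned edge stands for one alignment action, such as opening the valve between two nodes, and unreachable sinks are skipped so that flow still reaches as many of the desired sinks as possible.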

OAK is a system of collaborative model-based control agents used for control. OAK's greatest strength is realized in a control agent's ability to determine the state of its model, compare that state to its current goal, and reactively plan. Thus, a control agent autonomously controls its domain.

Referring again to FIG. 1, each control agent performs reasoning and planning at various levels of abstraction on its own virtual machine. This allows for logical distribution of time-intensive operations like planning. A higher-level control agent, A₁ for example, will develop a relatively abstract plan for accomplishing some goal without having to worry about its implementation. Instead, each goal in that plan is passed to lower level control agents that further decompose the details of that goal. This propagation continues with plans becoming increasingly detailed until they reach control agents (A₁₁₁, A₁₁₂, A₁₁₃) that can translate the goals directly into hardware commands. It is important to note that while this architecture implies a hierarchical planning structure, the planning at individual levels is not restricted to the Hierarchical Task Planning approach. This design also allows for a high degree of modularity, since control agents that have the same interface can replace each other.

FIG. 2 illustrates a sample OAK communication network of control agents. Communication is not limited to the logical channels shown in FIG. 1. Rather, any control agent 22 may communicate to any other control agent 22 via its Agent Communication Broker (ACB) 24, through the Agent Communication Framework (ACF) 26. Thus, any control agent may subscribe to any other control agent's notifications. Control agents are associated with subsystems or high level models. Each type of control agent may operate with data and goals provided by other control agents. Within OAK, each control agent associated with a subsystem provides closed-loop subsystem management using sensed parameters to infer system state. The control agents can adjust control commands to achieve a desired operating profile and/or respond to external events or component failure. Each control agent can communicate state information and goals to other control agents using a “publish and subscribe” mechanism to limit bandwidth usage.

Each control agent in an OAK architecture continually performs a cycle of mode estimation, planning, and execution that is influenced by observed subsystem state and received goals. This cycle is implemented using various manager components in each control agent.

The architecture of a control agent 22 is illustrated in FIG. 3. Control agent architecture includes managers for each major step in the functioning of the control agent.

The executive manager 32 acts as a gateway (analogous to OSI level 6) for the control agent 22. It is responsible for decoding and executing incoming messages, as well as encoding outgoing messages. As such, it is the only manager that communicates directly with the ACB 24. The ACB 24 communicates through the ACF 26 to other control agents 22.

OAK control agents attempt to handle large amounts of information in an efficient manner. While the planner is planning for a goal, the control agent can still accept facts and resolve queries. As this is happening, there may be a change in state that may preempt a plan already in progress. Multi-tasking within control agents is handled by making each OAK message object a separate thread.

One type of message object is the query object. A query object is created when the control agent receives a state query. It interfaces with the MBRE manager, which stores the current state of the model. The query object causes creation of a fact object addressed to the query originator if the queried state is recognizable, or an exception if it is not.

Fact objects are another type of message object. In addition to the scenario above, a fact object is created when state updates from other control agents or from associated hardware are received. State updates from other control agents are received via the ACB, and state updates from associated hardware are received via the Control Mediator (CM).

The third type of message object is the goal object. Goal objects are sent from higher level control agents to communicate their desired state for this control agent's subsystem. Goal objects interface with the MBRE manager to get the control agent's current state, the Planner to receive the proposed plan, and the plan implementation manager to execute the plan.

The fourth type of message object is the subscription object. These objects contain a reference to a control agent and a particular state variable in which that control agent is interested. After receiving a subscription object, the receiving control agent will send fact messages to the subscribing control agent whenever the information in which it is interested changes. To this end, the subscription object interfaces with the control agent's subscription manager.

A fifth type of message object is the administrative command object. These objects contain commands that deal with the control agent software's behavior outside the scope of the intelligent control system: things like stopping a control agent, printing debugging messages, or resetting the control agent.

In addition to message objects, there are also seven managers within an OAK control agent. The executive manager 32 routes all of the incoming and outgoing messages for a control agent. The MBRE manager 34 exists solely to store the current state of the model and to interface the query, fact, and goal objects with the model-based reasoning engine (MBRE) 36. The MBRE manager 34 is the component that handles the mode estimation portion of the OAK control cycle. The ACL manager 38 translates each incoming message from the ACL into an object that the control agent can use.

Each incoming goal that results in a successful plan must also communicate with the plan implementation manager (PIM) 40, which ensures that no step of a plan is implemented before each of the previous steps is successful. The PIM 40 corresponds to the execution phase of the OAK control cycle. This manager will have access to a matrix that stores the average transition time for each goal, a multiple of which is added to a constant time to determine the amount of time that the control agent will wait for an action to be performed. This makes the control agents adaptive and can help detect problems with the system early on, since an action that exceeds its expected wait time signals a possible fault.
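The wait-time rule above amounts to timeout = constant + multiple × (average transition time). A minimal sketch follows; the class name `TransitionTimer`, the default constant of 1.0 second, and the default multiple of 2.0 are assumptions chosen for illustration, since the description does not give values.

```python
from collections import defaultdict


class TransitionTimer:
    """Hypothetical helper: tracks average transition time per goal."""

    def __init__(self, base_seconds: float = 1.0, multiple: float = 2.0):
        self.base_seconds = base_seconds   # assumed constant time
        self.multiple = multiple           # assumed multiplier on the average
        self._total = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, goal: str, seconds: float) -> None:
        """Store one observed transition time so later timeouts adapt to it."""
        self._total[goal] += seconds
        self._count[goal] += 1

    def average(self, goal: str) -> float:
        count = self._count[goal]
        return self._total[goal] / count if count else 0.0

    def timeout(self, goal: str) -> float:
        """Time to wait for an action before treating it as a possible fault."""
        return self.base_seconds + self.multiple * self.average(goal)
```

With observed times of 2.0 and 4.0 seconds for a goal, the average is 3.0, so the agent would wait 1.0 + 2.0 × 3.0 = 7.0 seconds before flagging the action.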

The subscription manager 42 handles subscription requests from other control agents. It stores lists of control agents that are interested in various state changes, and automatically generates fact messages to each of the subscribed control agents when a given state variable changes.
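The publish-and-subscribe behavior described above can be sketched as follows. The class and method names are hypothetical, and delivery of a fact message is modeled as a plain callback rather than a message sent through the ACF.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical callback standing in for delivery of a fact message.
FactHandler = Callable[[str, str], None]  # (state variable, new value)


class SubscriptionManager:
    def __init__(self) -> None:
        # state variable -> list of (subscriber id, delivery callback)
        self._subscribers: Dict[str, List[Tuple[str, FactHandler]]] = defaultdict(list)
        self._state: Dict[str, str] = {}

    def subscribe(self, agent_id: str, variable: str, handler: FactHandler) -> None:
        """Register a control agent's interest in one state variable."""
        self._subscribers[variable].append((agent_id, handler))

    def update(self, variable: str, value: str) -> int:
        """Record a state value; notify subscribers only if it changed."""
        if self._state.get(variable) == value:
            return 0  # unchanged values generate no traffic
        self._state[variable] = value
        subscribers = self._subscribers.get(variable, [])
        for _agent_id, handler in subscribers:
            handler(variable, value)
        return len(subscribers)
```

Sending facts only on change, and only to agents that asked for that variable, is what keeps bandwidth usage low.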

The planner 44 is responsible for deducing plans from the current input goal and the current estimated state. The planner 44 corresponds to the planning phase of the OAK control cycle.

Finally, the reactive manager 46 continually compares the control agent's states to a vector of emergency states and a persistent goal to determine whether or not the control agent's model has entered a state that the developer has identified as abnormal or hazardous. In the event that the reactive manager 46 identifies that the control agent has entered such a state, it pushes out a new goal for the control agent to implement in order to prevent serious damage to the system.

The reactive manager 46 works by having a list of configurations of the system in which the reaction will occur. Each configuration is a propositional expression of facts, and can use the AND, OR, and NOT operators as well as arbitrarily nested parenthesized clauses. Associated with each proposition is a goal. When the control agent's model is in a state that is consistent with one of the configurations, the reactive manager 46 will fire and give the control agent the goal associated with that configuration.

To determine whether or not the system is consistent with one of the listed configurations, the reactive manager 46 will compare the system state to each of the propositions in the list each time the system's state changes. The configurations in the list will have a strict ordering, so some propositions are guaranteed to be checked before others later in the list. If a proposition is consistent with the system state, the associated goal is sent to the executive. This goal will be treated the same as any incoming goal, including preempting the current goal if the precedence of the current goal is sufficiently low. Upon receiving a reactive goal, the executive may send an exception-type message into the ACF.
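A minimal sketch of this ordered check follows. The combinator names `fact`, `all_of`, `any_of`, and `negate`, and the example variables and goals, are hypothetical. Nesting the combinators stands in for the nested parenthesized clauses.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, str]
Proposition = Callable[[State], bool]


def fact(variable: str, value: str) -> Proposition:
    return lambda s: s.get(variable) == value


def all_of(*props: Proposition) -> Proposition:   # AND
    return lambda s: all(p(s) for p in props)


def any_of(*props: Proposition) -> Proposition:   # OR
    return lambda s: any(p(s) for p in props)


def negate(prop: Proposition) -> Proposition:     # NOT
    return lambda s: not prop(s)


def react(state: State,
          configurations: List[Tuple[Proposition, str]]) -> Optional[str]:
    """Return the goal of the first configuration consistent with the state."""
    for proposition, goal in configurations:  # strict ordering: earlier entries win
        if proposition(state):
            return goal
    return None


# Hypothetical configuration list for illustration.
CONFIGURATIONS = [
    (all_of(fact("pipe3", "ruptured"), negate(fact("valve2", "closed"))), "isolate_pipe3"),
    (any_of(fact("pump1", "failed"), fact("pump2", "failed")), "switch_to_backup"),
]
```

In this sketch `react` would be called on every state change, and any goal it returns would be handed to the executive like any other incoming goal.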

In addition to emergency states, the reactive manager 46 also stores a persistent goal. This goal will be given by the developer at startup, but may be changed by the system user via special configuration messages during run time. Consistency with the persistent goal is checked before the reactive manager 46 checks for emergency conditions, and if the state is found to be inconsistent, the control agent reacts as described above. The user should not, however, consider a departure from the persistent goal's state to be catastrophic, as it is assumed to be in the emergency case.

Mode Estimation (ME) is the process of deducing system states based upon partial or incomplete information. Within the context of OAK, Mode Estimation involves inferring the states of various system components using: first-principles logic-based models, observations gathered from the hardware, and knowledge of past commands issued to the hardware. Internal to a control agent, this capability is provided by the REManager (Reasoning Engine Manager), which uses a single underlying reasoning engine to produce diagnostic estimates or “candidates”. These candidates are then converted into fact objects, which are the atomic units of knowledge representation in OAK. Fact objects are passed between control agents, and consequently, between reasoning engines.

Each time a command set is issued to the hardware, or new observations arrive from the hardware in the absence of an explicit command, a state transition occurs and the reasoning engine attempts to determine the most likely candidates (resultant system states). The process of generating candidates comprises a number of steps. First, the reasoning engine generates an expected next state for each component based upon any command information that may be available. Next, the reasoning engine considers all recent observations from the system. A consistency-checking algorithm is then run against the plant model given the new observations. The output from this algorithm is a set of ranked candidates, or state estimates. It is then the responsibility of the external actor who is using the reasoning engine to decide which candidate it wishes to select or “believe”.

The reasoning engine supports a number of parameters that are set when the engine is first initialized. These parameters are used to configure the estimation process, and different settings can potentially result in different diagnoses being reached. Table 1 illustrates an example of reasoning engine settings to be used with OAK control agents.

TABLE 1

Parameter                                      Setting
Search engine type                             CBFS (Conflict-directed Best First Search)
Number of candidates                           3
Maximum number of candidates to search over    10000
History length                                 10
Maximum number of trajectories                 10
Progress style                                 Full
Trajectory tracker type                        Extend

OAK receives high level goals from an external actor and creates an implementation plan for these goals. The implementation is carried out by a plurality of control agents in a distributed manner utilizing model-based reasoning techniques. Moreover, OAK reacts autonomously to changes in the system. The distributed nature of OAK permits the aforementioned functions to be realized in a time efficient manner. In addition, OAK can respond to catastrophic changes in the system environment, including damage to the OAK system itself. All subsystems of an OAK system, including the intelligent components, are decoupled from one another wherever possible, allowing for many different such subsystems to be utilized. Ultimately, OAK controlled systems reduce manpower requirements, which is especially significant in repetitive or dangerous domains.

In the following claims, any means-plus-function clauses are intended to cover the structures described herein as performing the recited function and not only structural equivalents but also equivalent structures. Therefore, it is to be understood that the foregoing is illustrative of the present invention and is not to be construed as limited to the specific embodiments disclosed, and that modifications to the disclosed embodiments, as well as other embodiments, are intended to be included within the scope of the appended claims. The invention is defined by the following claims, with equivalents of the claims to be included therein.

1. A system including a processor and a memory for controlling a connection-based system in which a plurality of independently executing control agents are distributed across an agent communication framework, each control agent being associated with a subsystem of the connection-based system wherein each control agent is responsive to other control agents such that connection-based system goals and connection-based system states can be propagated throughout the distributed network of control agents and acted upon to autonomously control the connection-based system, said system comprising: in the subsystem, hardware components that are subject to component failures; an agent communication framework that serves as a communication network for the plurality of control agents, wherein each control agent is comprised of: an agent communication broker that provides a communication interface between a control agent and the agent communication framework; a model based reasoning engine (MBRE) communicable with the agent communication broker, the MBRE being based on failure sensitive component models of the hardware components that include (i) physical properties based models on the hardware components, and (ii) failure states of the hardware components, wherein the MBRE estimates a component failure based subsystem state based on data gathered from the hardware components and knowledge of the past commands issued to said hardware components; and a planner communicable with the agent communication broker and the model based reasoning engine that generates a reconfiguration plan for the implementation of a specific goal based on input including the specific goal and the subsystem state, and wherein each control agent propagates commands for achieving the specific goal that are issued to associated hardware components.
2. The system of claim 1 wherein at least one control agent is communicable with an external actor that provides a system level goal to initiate connection-based system-wide control.
3. The system of claim 2 wherein a control agent is communicable with at least one hardware component within the control agent's subsystem via a control mediator that provides an interface between the control agent and the hardware's device controller.
4. The system of claim 3 wherein said control agent is further comprised of: an executive manager communicable with the agent communication broker that routes incoming and outgoing messages for the control agent; an ACL manager communicable with the executive manager for translating incoming messages into message objects, wherein message objects include a query object having data pertaining to subsystem status queries, a fact object having data pertaining to subsystem state updates, a goal object having data pertaining to the goal of a control agent, and a subscription object having data pertaining to another control agent that is interested in the status of the present control agent; a model-based reasoning engine manager communicable with the executive manager, said model-based reasoning engine manager for storing the current state of a subsystem model; and communicating query, fact, and goal objects between the model-based reasoning engine and executive manager; a plan implementation manager communicable with the executive manager for implementing a plan generated by the planner; a subscription manager communicable with the executive manager for maintaining a list of other control agents authorized to receive messages from the present control agent handling requests from other control agents and generating fact message objects pertaining to subsystem state changes to said other control agents.
5. The system of claim 4 wherein said control agent is further comprised of a reactive manager communicable with the executive manager for monitoring and comparing the current control agent's state against a set of emergency states and a persistent goal to determine whether the current control agent's model has entered an abnormal or hazardous state, wherein when the reactive manager determines that a control agent's model has entered an abnormal or hazardous state, said reactive manager generates a new pre-emptive goal for the control agent to implement in order to prevent damage to the connection-based system.